The System for Recognizing Chemical Names and Detecting Chemical Passages in Patent Documents

Authors

  • Shuo Xu
  • Weijia Xu
Abstract

The CHEMDNERPatent task of the BioCreative V challenge comprises three subtasks: CEMP, CPD, and GPRO. We participated in the CEMP and CPD subtasks and built our system on selected open-source NLP and machine learning toolkits. The system treats the CEMP subtask as a sequence labeling problem and the CPD subtask as a text classification problem.
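
The two problem framings named in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the whitespace tokenizer, the B/I/O tag set, the function names (`tokenize`, `bio_labels`, `passage_label`), and the example sentence are all assumptions made here for illustration. It only shows how annotated chemical mentions might be turned into per-token labels (sequence labeling) and into one label per passage (text classification).

```python
import re

def tokenize(text):
    """Split text into (token, start, end) triples on whitespace (assumed tokenizer)."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]

def bio_labels(text, mentions):
    """Tag each token B, I, or O given character-offset mention spans (start, end)."""
    labels = []
    for token, start, end in tokenize(text):
        tag = "O"  # default: token lies outside every mention
        for m_start, m_end in mentions:
            if start >= m_start and end <= m_end:
                # First token of a mention gets B, later tokens get I.
                tag = "B" if start == m_start else "I"
                break
        labels.append((token, tag))
    return labels

def passage_label(mentions):
    """Passage-level target: 1 if the passage has any chemical mention, else 0."""
    return 1 if mentions else 0

# Hypothetical example sentence with one annotated mention.
example_text = "A solution of sodium chloride in water was stirred."
mention_start = example_text.index("sodium chloride")
example_mentions = [(mention_start, mention_start + len("sodium chloride"))]

for token, tag in bio_labels(example_text, example_mentions):
    print(f"{token}\t{tag}")
print("passage label:", passage_label(example_mentions))
```

In this framing, a sequence labeler would be trained to predict the per-token tags, and a text classifier would be trained to predict the passage label. Which toolkits and features the authors used is not stated in the abstract.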

Similar resources

Adapting ChER for the recognition of chemical mentions in patents

ChER (Chemical Entity Recogniser) is a pipeline of natural language processing tools optimised for the recognition of chemical names in scientific abstracts. It formed the basis of our submissions to the previous edition of the CHEMDNER track in BioCreative IV, and was one of the top-performing systems both for the chemical document indexing (CDI) and chemical entity mention recognition (CEM) s...

Extracting and connecting chemical structures from text sources using chemicalize.org

BACKGROUND Exploring bioactive chemistry requires navigating between structures and data from a variety of text-based sources. While PubChem currently includes approximately 16 million document-extracted structures (15 million from patents) the extent of public inter-document and document-to-database links is still well below any estimated total, especially for journal articles. A major expansi...

Information Retrieval and Text Mining Technologies for Chemistry.

Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical document...

TREC-CHEM 2010 : Notebook report

The TREC Chemical IR Track is a domain-specific evaluation campaign working with documents containing specific lexica, including chemical formulas and specific names. The 2010 edition of the track also included supporting material in addition to text: images and structure information files. As in the previous year, we had two tasks: a patent focused prior-art (PA) task and a user-focused Techno...

Chemical Engineering Software and Legal Protection Thereof

In recent years, an increasing number of Chemical Engineering Software (CES), which play an important role in improving efficiency in the petroleum industry, has been introduced to the market. Generally, software is the product of intellectual creativity, but protection of the intellectual property residing in software is the subject of some controversy. This paper explores the ...

Publication date: 2015